Summary

Back-tracking from proofing center interactions of ribosome purines A1492, A1493, and
G530, monitoring codon:anticodon complementarity, led to reconstruction of a center designed to
monitor adenosine self-recognition, in an early form of translation, uncovered by equating the
time-order of codon assignments, in code formation, to amino acid synthesis path-distance. Acquisition of
Watson-Crick complementarity from a wider, anti-parallel double-helix of self-interacting, AAA:AAA
codon:anticodon triplets required changes to two interactions, and loss of ‘extended anticodon’ function
by A37: (i) at codon 5’-site base, an adaptor-B36-2’OH → N1-A1493 H-bond replaced an A37
(Hoogsteen edge) H₂N6 → N1-A1493 bond; and (ii) at mid-site, adaptor-B35-2’OH → G530-N3 and → 2’OH
bonds replaced H-bond G530-2’OH → N7 adaptor-B35 (Hoogsteen face). Proofing center nucleotides
remained unchanged. These findings indicate that the transition from base-pair self-recognition to
complementarity occurred within the proofing center of a ribosome with a functioning ratchet. They also
revealed that pre-code translation, based on self-recognition, incorporated an error-suppression feature
and that the Donohue (pairing-2) double-helix preceded Watson-Crick complementarity.
Background

Bi-directional H-bonds between complementary nucleotides within the anti-parallel poly(pentose- 
phosphate) scaffold of a double-helix (Watson and Crick, 1953; Franklin and Gosling, 1953) were 
recently interpreted as the imprint of an antecedent polynucleotide replicator, with direct inter-base 
(A = A, G = G, U/T = U/T, C = C)¹ recognition (Davis, 2018a,b). Motivating this interpretation was a
structure principle from investigations on the origin of the genetic code (Davis, 1998, 1999a,b, 2002,
2004, 2005a,b, 2007, 2008a,b,c, 2009, 2011, 2013), specifying that a pre-LUCA² invariant, such as the
ubiquitous α-carboxyl in amino acid intermediates, represented the vestige of attachment to a
pre-divergence cofactor, or scaffold. This became apparent when tRNA core structure (Saks and Sampson,
1995) distribution and pre-divergence phylogenetics revealed amino acid synthesis pathways had 
utilized bifunctional, cofactor/adaptor tRNA during code formation (Davis, 2008b). 

Application of this principle to the invariant phosphate of bis-phosphorylated intermediates in the pre- 
LUCA, autocatalytic reductive pentose-phosphate cycle, whose triose-phosphate intermediates are, 
notably, modified substituents of the spontaneous, autocatalytic formose cycle, uncovered a pre-RNA 
ladder-like replicator, with parallel poly(P) scaffold strands (Davis, 2012, 2015, 2018c). Consistent with 
ribozyme active-site molecular dynamics (Sgrinani and Magistrato, 2012; Sponer et al., 2012), binary 
sequence poly(pentose-phosphate), self-recognition replicators were, plausibly, pre-RNA carbozymes, 
subject to charge attraction (Wächtershäuser, 1992), and apparent source of the RNA/DNA
anti-parallel poly(pentose-phosphate) scaffold (Davis, 2017). This provided structural evidence of
replication having coevolved with the pre-RNA pathways of central metabolism, as Orgel (2008) envisioned.

In reconstructing genetic code formation, equating the time-order of codon assignments to amino acid
synthesis path-length (Davis, 1999, 2008a, 2012, 2013) revealed that codon bases were recruited in a
5’ → mid → 3’ order. Code formation based on this path-distance metric accounted for more than fifty
diverse features of code structure, including the anomalous codon assignments to Leu⁷ and Arg⁹
(footnote 3). It also predicted the structure of a 23-residue (16-residue [4Fe-4S] cofactor-binding
segment + 7-residue anionic ‘surface’-attachment foot) pre-LUCA ferredoxin antecedent, reconstructed
by Norgaard (2009) and Norgaard et al. (2009).

1 A, adenosine; G, guanosine; U/T, uridine/thymidine; C, cytidine; N, unspecified; and P, phosphate.

2 Last Universal Common Ancestor.

3 Superscripts, number of post-precursor (oxaloacetate, α-ketoglutarate) steps in synthesis pathway; and, three-letter amino acid abbreviations.

With codon base recruitment in a 5’ → mid → 3’ sequence, the shared mid-A in the 16 XAN triplets⁴
assigned in the NH₄⁺ Fixers Code (first code) to both diacid amino acids (Asp¹, Glu¹), and their amides
(Asn², Gln²), conforms with pre-code translation on a poly(A) template (Davis, 1999, 2007b). Random
sequence peptides of homologues Asp¹ and Glu¹ likely resulted initially. Identification of the amino
acids and their template in pre-code translation plainly represents an advance on previous pre-code
translation scenarios (Pool et al., 1998; Noller, 2006; Wolf and Koonin, 2007; Zenkin, 2012). Moreover,
this finding embodies a second principle bearing on the origin of the double-helix: the transition from 
self-recognition to complementarity took place in the proofing center of a pre-code ribosome, 
possessing a template-adaptor ratchet. 

Back-tracking from the ribosome proofing center monitoring a complementary codon:anticodon
(A-form) double-helix (Ogle et al., 2001) has led, in this endeavor, to a pre-code proofing center,
designed to monitor adenosine self-recognition (pairing-2, Donohue, 1956). Consistent with an 
evolutionary relationship between the proofing center for each mode of template-directed translation, 
they show close structural similarity. A conserved anticodon loop nucleotide, A37 (Saks and Conery, 
2007), is also found to have formed an ‘extended anticodon’ (Yarus, 1982) in pre-code translation. 


Pre-code H-bond proofing interactions 

(i) 5’-Site. Figure 1 shows the 5’-codon site of the pre-code proofing center reconstructed from
ribosome high-resolution X-ray diffraction (Ogle et al., 2001). It can be seen to replace an H-bond from
2’OH of anticodon U36 ribose to N1 of ribosome nucleotide A1493, with a bond from H₂N6 of A37
(Hoogsteen face), in the anticodon-loop of a proto-‘adaptor RNA’, to A1493 N1. The substitution is
necessitated by a larger inter-strand distance between the adenosine pair (A1:A36) (Fig. 1b) versus
the equivalent Watson-Crick pair (A1:U36) (Fig. 1a) in the codon:anticodon double-helix: glycosidic
bond (C1’-C1’) distance, 13.6 ± 0.16 vs. 10.3 Å; mean ± s.e.m., n (triplet sites) = 3.

4 X, coding site base; N, ambiguous site.
Fig. 1. 5’-Codon site interactions within the ribosome proofing center. (a) H-bond location
and direction for the first A:U base pair in a standard complementary codon:anticodon helix. (b)
Proofing center in pre-code translation with an A:A codon:anticodon pair, formed by
self-recognition. Each model was constructed using Facio 20.1.3 software (Suenaga, 2005) and
optimized with GAMESS 64-2016 (Schmidt et al., 1993). Boxed letters, read clockwise from top,
denote a ribosome, codon, and adaptor base(s). Bar, 1 Å.










(ii) Mid-Site. Monitoring the mid-site in a complementary codon:anticodon complex involves the A2
2’OH group H-bonding with N3 and 2’OH of ribosome A1492 (Ogle et al., 2001). Both bonds also
arise during translation of the self-recognition codon mid-site (Fig. 2b). Comparable G530 N3 and 2’OH
H-bonds with U35 2’OH, in the complementary codon:anticodon complex (Ogle et al., 2001), are replaced,
however, by an H-bond from a proto-RNA adaptor A35 2’OH to G530 N7 (Hoogsteen face). The transition
to a complementary codon:anticodon pair thus resulted in the net addition of a single H-bond.

Fig. 2. Mid-site H-bonds at the ribosome proofing center. (a) Complementary, and (b) self-recognition
modes of template-reading. Model details appear under Figure 1.
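The mid-site bookkeeping described above can be tallied in a few lines. The following is a minimal sketch, not part of the original analysis: it lists only the H-bonds named in the text, as pairs of (codon or adaptor group, ribosome group), and the names `MID_SITE`, `counts`, `shared`, and `net_gain` are illustrative. Under that reading, the complementary mode has one more mid-site H-bond than the self-recognition mode, and the two A1492 contacts are common to both.

```python
# Mid-site proofing-center H-bonds as listed in the text.
# Each entry pairs a codon/adaptor group with a ribosome group; the
# labels are shorthand and carry no donor/acceptor direction.
MID_SITE = {
    "complementary": [
        ("A2 2'OH", "A1492 N3"),
        ("A2 2'OH", "A1492 2'OH"),
        ("U35 2'OH", "G530 N3"),
        ("U35 2'OH", "G530 2'OH"),
    ],
    "self_recognition": [
        ("A2 2'OH", "A1492 N3"),
        ("A2 2'OH", "A1492 2'OH"),
        ("A35 2'OH", "G530 N7"),
    ],
}

# Number of mid-site H-bonds in each template-reading mode.
counts = {mode: len(bonds) for mode, bonds in MID_SITE.items()}

# Bonds present in both modes (the A1492 contacts).
shared = set(MID_SITE["complementary"]) & set(MID_SITE["self_recognition"])

# Net change in H-bond count on moving to complementarity.
net_gain = counts["complementary"] - counts["self_recognition"]

print(counts, len(shared), net_gain)
```

The same structure could be extended to the 5’- and 3’-sites if their bonds were enumerated in the same way.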
(iii) 3’ Site. Ribosome monitoring of the codon 3’-“wobble” site entails a single H-bond: A3 2’OH to
O6 (G530). In addition, A3 2’OH bonds to a Mg²⁺ ion linked to Pro44 (E. coli numbering) in ribosome
protein S12 and to O2 of pyrimidine C518 (Ogle et al., 2001). Although the H-bond between A3 2’OH
and G530 O6 is plainly admissible, in the purine-era (Fig. 3b), the bonds mediated by Mg²⁺ appear
inadmissible. A restriction to poly(A)-directed polymerization of random sequence peptides of first
generation amino acids, Asp¹, Glu¹ (Davis, 2018b), does not exclude some degree of ribosome, and
adaptor, infiltration by pyrimidines, such as C518. The presence of S12, in a pre-code era, however,
can be discounted; an O atom within the ribonucleotide scaffold, plausibly, bonded initially to the metal
ion, at the codon 3’-site.

Fig. 3. Codon 3’-“wobble” site forms a single H-bond and stabilizing bond to a divalent
metal ion, in the proofing center for both template-reading modes. Refer to Fig. 1 for model
details.

Before complementarity 

Watson-Crick base pair complementarity underlies replication, transcription, and translation in all known 
forms of life. Bi-directional base pair bonds and double-helix scaffold strands, however, conformed with 
the imprint of a pre-complementarity antecedent, reliant on monomer self-recognition (Davis, 2012, 
2017, 2018a). From ribosome proofing center models, in Figs. 1-3, the transition from pre-code 
translation, with a single AAA:AAA codon-anticodon, to decoding multiple paired triplets, exemplified by 
AAA:UUU, required discarding an ‘extended anticodon’ function of A37, and re-directing a G530 2’OH
H-bond; ribosome nucleotides A1492, A1493, and G530 remain unaltered. Monitoring the narrower
C1’-C1’ distance of a purine:pyrimidine pair, in a Watson-Crick double-helix (Fig. 4), accounts for these
changes in the ribosome proofing center.

Back-tracking from the initial code, identified with comparative amino acid synthesis path-distances 
(Davis, 1999, 2007, 2008a), indicated poly(A) had served as template during the pre-code evolution of 
translation. In view of this, structural evidence of monomer self-recognition in pre-code replication 
(Davis, 2013, 2015, 2017) indicated AAA:AAA formed the codon:anticodon pair in pre-code translation. 

Fig. 4. Complementary and self-paired bases in triplets forming a double-helix (A-type), and a
ladder structure. (a) A:U base pairs of Watson-Crick form. Inter-strand distance, C1’-C1’, 10.4
Å. (b) A:A base pairs in a Donohue self-recognition double-helix. C1’-C1’ distance, 13.5 Å.
(c) A:A base pairs in a pre-helix, ladder-like structure, whose polyanionic scaffold could bind
to a metal ion, or mineral surface. C1’-C1’ distance, 14.1 Å. Bar, 1 Å.
A:A base pairs, ‘pairing-2’ in Donohue (1956), yield a 3.1 Å wider double-helix (Fig. 4b). The
Donohue-2 double-helix, according to present findings, preceded the complementary Watson-Crick double-helix.
H-bonded G:G base pairs (Donohue ‘pairing-9’), likewise, form a double-helix. With glycosidic bonds
superimposable on those of an A:A pair, a binary purine-sequence and anti-parallel scaffold appears
possible (Donohue 2,9 double-helix); its possible role in the transition to complementarity is discounted
here, with preference for the direct route from poly(A).

Invariants in the ancient pathways of central metabolism and nucleotide synthesis (Davis, 2012,
2013a,b, 2015, 2017) indicate the Donohue-2 double-helix was preceded by ladder-like structures,
incorporating a form of selfing. Figure 4(c) shows an H-bonded AAA:AAA triplet ladder. The C1’-C1’
inter-strand separation is comparable to a Donohue double-helix; but base-pair separation is double.
Template advancement during translation, according to this, was halved during ribosome evolution. 
Proofing center van der Waals surface, more likely, shaped the template:adaptor triplet pair into a 
double-helix, on which the proto-ribosome ratchet could act. Reduced stacking energy, from the larger 
inter-base spacing, suggests divalent metal ions, or direct attachment to a cationic surface, stabilized 
the nucleotide ladder. 

Encoding first generation amino acids by A-rich triplets (Davis, 1999, 2007) fits with other evidence of 
early adenosine significance: ribosome A1492 and A1493 monitoring of the codon:anticodon, type A 
double-helix, minor groove (Ogle et al., 2001)⁵; A-minor motif stabilization of RNA tertiary structure
(Battle and Doudna, 2002); adenosine participation in biomolecular energy transfer (Lipmann, 1941);
and, more broadly, the role attributed to purines in early metabolism (Wächtershäuser, 1992).

Restriction of pre-code translation to a poly(A) template contributed an error suppression feature: 
exclusion of mutations resulting from an unassigned (non-AAA) triplet. Whenever no cognate adaptor 
can read a given triplet, translation permanently stalls (Bretscher et al., 1965). Synthesis of random 
sequence peptides on a pre-informational, poly(A), template thereby contributed to pre-code evolution 
of translation (Moran et al., 2008; Spirin, 2009), unhindered by ‘lethal’ translation-stalling mutations. 

Pre-code translation resulted in the synthesis of peptides containing first-generation amino acids, 
necessarily capable of functioning in a sequence-independent manner. Consistent with this, Asp¹ and
Glu¹ form on amination of oxaloacetate and α-ketoglutarate, respectively, in the autocatalytic, reductive
citrate cycle (Wächtershäuser, 1992). In addition to being homologous, the diacid amino acids are
direct precursors of nearly half the amino acids in proteins, indicative of their antiquity, and significance
as a point of entry of NH₄⁺ in amino acid synthesis pathways and beyond (Davis, 1999, 2013a). In
polymeric form, they provided a source of precursor molecules, anchored to a cationic surface, in an
aqueous environment above pH 4.25 (the carboxylate pKa), broadly resembling polyphosphate, which
likely functioned as an early chaperone in macromolecular (protein) folding (Gray et al., 2014) and as a
primal energy source (Westheimer, 1987; Kornberg, 1995).
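The pH threshold quoted above follows from the Henderson-Hasselbalch relation: at pH equal to the pKa, half of the carboxyl groups carry a negative charge, and the charged fraction rises steeply above that. The sketch below is illustrative only. It assumes independent side-chain carboxyls with a single pKa of 4.25, the value given in the text, and ignores the pKa shifts expected in a polyanionic peptide; the function name `fraction_deprotonated` is hypothetical.

```python
def fraction_deprotonated(pH, pKa=4.25):
    """Fraction of carboxyl groups in the anionic (COO-) form.

    Henderson-Hasselbalch: [A-]/[HA] = 10**(pH - pKa), so the
    anionic fraction is 1 / (1 + 10**(pKa - pH)).
    Assumes independent groups with one shared pKa.
    """
    return 1.0 / (1.0 + 10.0 ** (pKa - pH))


# Anionic fraction at the pKa itself and at two higher pH values.
for pH in (4.25, 5.25, 7.0):
    print(pH, round(fraction_deprotonated(pH), 3))
```

Under these assumptions, a polycarboxyl-peptide is predominantly anionic, and so able to bind a cationic surface, once the pH is about one unit or more above the pKa.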

5 G530 being the other nucleotide to bond with codon:anticodon bases, highlights the role of purines in the origin 
of translation. 
Transition to encoded translation

Synthesis of functional proteins, with a complex residue sequence, became possible and
advantageous, following template-directed polycondensation of pre-code peptides. Encoded translation
at the initial NH₄⁺ Fixers Code (Fig. 5) already differs significantly from pre-code translation, on a
poly(A) template, with conspicuously more template triplets (16 vs. 1) and distinguishable residues and
signals (4 vs. 1). This development accompanied the transition from self-recognition, by a single purine
(A), to Watson-Crick complementarity in A:U base pairs, initially, and then A:U and G:C base pairs. A
universal anticodon 5’U rendered NH₄⁺ Fixers codons degenerate at their 3’-site, neutralizing all
3’ mutations. This also implies that Asp¹ and Glu¹ were ambiguously encoded by GAN, until the final
(overprinting) stage of code evolution. With all 5’-bases assigned, mutations to an unassigned triplet
resulted solely from substitutions at the codon mid-A site. One-third of all mutations to the NH₄⁺ Fixers
Code would, therefore, stall translation, in the manner reported by Bretscher et al. (1965). By
comparison, a random distribution of 16 assigned triplets, among 48 not-yet-assigned triplets, would
mutate to an unassigned translation-stalling triplet about three-quarters (48/63) of the time. Clustering
all 16 NH₄⁺ Fixers codons within the NAN set saturated this set with assigned triplets, reducing the
probability of a ‘lethal’ translation-stalling mutation by 2.29-fold.

Fig. 5. Transition from pre-code poly(A)-directed translation to NH₄⁺ Fixers Code. Random
polycarboxyl-peptides form initially, in this scenario, as an AAA:AAA, then AAA:UUU, codon:anticodon
pair jointly specifies an Asp¹ and Glu¹ residue. Recruitment of a chain-termination, Ter,
codon, UAA, results in peptides of defined length. Addition of G, and pyrimidine complement,
C, expanded the code, saturating the NAN triplet set: GAN encoded the diacid amino acids, and
CAN and AAN their amides. Asp¹, Glu¹ and Gln² have type ID tRNA (bracketed, upper case
letter), whereas tRNA(Asn) is a type IA adaptor; the core group difference is linked to competitive
displacement of ancestral tRNA(AspGln)-ID (proto-adaptor(AspGlu)-IA) from AAA.
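The stalling probabilities quoted above for the NAN codon set can be checked by enumeration. The sketch below is one reading that reproduces the reported figures, not necessarily the author's own calculation. It assumes that the 16 assigned triplets are exactly the NAN set, that mutations are single-base substitutions with all nine neighbours equally likely, and that in the random case a mutant is equally likely to be any of the other 63 triplets, 48 of them unassigned. The names `NAN_SET`, `stall_fraction`, `clustered`, `random_expect`, and `fold` are illustrative.

```python
from fractions import Fraction
from itertools import product

BASES = "UCAG"
ALL_TRIPLETS = ["".join(t) for t in product(BASES, repeat=3)]

# Assumption: the first code assigns exactly the 16 triplets with mid-base A.
NAN_SET = {t for t in ALL_TRIPLETS if t[1] == "A"}


def single_base_neighbours(triplet):
    """Return the 9 triplets that differ from `triplet` at exactly one site."""
    neighbours = []
    for site in range(3):
        for base in BASES:
            if base != triplet[site]:
                neighbours.append(triplet[:site] + base + triplet[site + 1:])
    return neighbours


def stall_fraction(assigned):
    """Fraction of single-base substitutions that leave the assigned set."""
    stalled = total = 0
    for triplet in assigned:
        for mutant in single_base_neighbours(triplet):
            total += 1
            if mutant not in assigned:
                stalled += 1
    return Fraction(stalled, total)


# Clustered code: only mid-site substitutions leave the NAN set.
clustered = stall_fraction(NAN_SET)

# Random placement (assumption): a mutant is any of the other 63 triplets,
# of which 48 are unassigned.
random_expect = Fraction(48, 63)

# Relative reduction in stalling mutations from clustering.
fold = random_expect / clustered

print(clustered, random_expect, round(float(fold), 2))
```

Under these assumptions the clustered fraction is 1/3, the random expectation is 48/63 (about three-quarters), and their ratio is 16/7, which rounds to the 2.29-fold figure in the text.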

Reconstruction of genetic code formation with the path-distance model (Davis, 1999a, 2007, 2008b) 
and structural features of the double-helix (Davis, 2018a,b) furnished evidence of a self-recognition 
stage in code formation, involving a poly(A) template (Fig. 5). Substitution of AAA:AAA codon:anticodon 
pairing with an AAA:UUU pair (Transitional-1 Code, Fig. 5) replaces self-recognition with
complementarity, while retaining translation of the poly(A) template. Admission of the pyrimidine
complement of A also led to conversion of an amino acid codon, AAA, to a chain termination codon, UAA,
by a single 5’-base substitution. Recruiting the Ter codon meant peptide length could then be encoded.
Addition of the G:C base pair (Transitional-2 Code, Fig. 5) led to expansion of the early code, by
addition of the amides, Asn² and Gln², of Asp¹ and Glu¹.

Concluding Remarks 

Present findings indicate semi-conservative, Watson-Crick complementarity (A → T → A, G → C → G)
derived from conservative, pre-informational (A → A) self-recognition in a Donohue-2 double-helix
within the ribosome proofing center. Studies on coevolution of the genetic code with the growth of pre- 
LUCA amino acid synthesis pathways previously uncovered evidence of pre-double-helix, binary 
sequence replicators linked to the autocatalytic reductive pentose-phosphate and citrate cycles. Since 
the former contains triose-phosphate analogues of (non-phosphorylated) constituents in the 
spontaneous, autocatalytic formose cycle, a framework resulted for explaining the origin of complex 
biopolymers from a 1-carbon source on the early Earth. The emergence of encoded amino acid residue
sequences, exhibiting ‘non-local order’ to a high degree, from formerly simple, random peptides
depicted here illustrates the general ‘simple → complex’ direction of molecular evolution,
notwithstanding the restriction of deterministic mappings to non-increasing complexity. This 
demonstrates the value of mutations contributing to complexity in the dissipation of the scalar forces 
driving molecular evolution (Davis, 1979, 1996a,b,c, 1998, 2017). 

Phosphate is central to the origin of life, within this framework, consistent with its significance in 
bioenergetics, bilayers, replication, transcription, and translation. Abiotic organosynthetic products
from space, including glycolaldehyde, racemic mixtures of thermostable amino acids, and various
hydrocarbons, alternatively, provide compelling evidence for ocean enrichment on the early Earth. Low
terrestrial phosphate levels (Keefe and Miller, 1995), the lack of a self-organizing principle akin to
autocatalysis/replication, and a preponderance of autotrophic thermophiles at the deepest branches of
the tree of life (Woese et al., 1987), on the other hand, favor a non-global site for the origin of life,
most notably, an alkaline hydrothermal vent. Self-annealing, cis-acting, polyanionic, self-replicating
polymers are, evidently, equipped to circumvent (Davis, 2015) thermodynamic and kinetic constraints
seen (DeDuve and Miller, 1991) to rule out the origin of life at a local site, involving charge-attraction.